Declassifying the Responsible Disclosure of the Prompt Injection Attack Vulnerability of GPT-3


Disclosed 05/03/2022. Declassified 09/22/2022.

If you'd like to cite this research, you may cite our paper preprint on arXiv here:
https://arxiv.org/abs/2209.02128

What is Prompt Injection?

The definitive guide to prompt injection is the following white paper from security firm NCC Group:
Exploring Prompt Injection Attacks by NCC Group (11 min read)
https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/

Prompt Injection has been in the news lately as a major vulnerability with the use of instruction-following NLP models for general purpose tasks. In the interest of establishing an accurate historical record of the vulnerability and promoting AI security research, we are sharing our experience of a previously private responsible disclosure which Preamble made on May 3rd, 2022 to OpenAI.

Art image

May 3,2022: The Discovery, and Immediate Responsible Disclosure

Document

May 3,2022: OpenAI Confirms Receipt of Disclosure

document

May 4,2022: Provided Additional Examples

document

May 19,2022: Details Provided for Approaches to Mitigate the Risks

documentDocumentDocument
Document

Get our updates.

We’ll notify our community of new policy marketplace updates and AI safety features are released.

Worry less and do more with secure AI